# A 55nm Ultra Low Leakage Deeply Depleted Channel Technology Optimized for Energy Minimization in Subthreshold SRAM and Logic

Harsh N. Patel<sup>\*</sup>, Abhishek Roy, Farah B. Yahya, Ningxi Liu, Benton Calhoun Department of Electrical and Computer Engineering University of Virginia, Charlottesville, USA Email: \*hnpatel@virginia.edu

Abstract— This paper presents an Ultra-Low Leakage (ULL) 55nm Deeply Depleted Channel (DDC) process technology for low-power Internet of Things (IoT) applications. The DDC ULL devices provide a 67% reduction in threshold voltage ( $V_T$ ) variation due to Random Dopant Fluctuation (RDF). Circuit techniques such as subthreshold operation and reverse body biasing (RBB) are co-designed with the technology to maximize energy and power savings. A test chip implements a 1Kb 6T SRAM, an FIR filter, and a 51-stage ring oscillator (RO) to showcase how the technology works with circuit techniques to minimize energy. The 6T SRAM array operates reliably down to 200mV with a reduced leakage power of 7nW (85% lower than non-DDC devices). The FIR filter consumes just 4.5pJ/cycle when operating at 0.36V and 200 kHz.

Keywords—DDC, ultra-low-leakage, body biasing, subthreshold and variation.

## I. INTRODUCTION

Ultra-low power (ULP) consumption and energy-efficient operation are the key requirements for systems catering to IoT applications such as embedded wireless sensors, wearable health monitoring devices, and other similar Body Sensor Network (BSN) applications. In such applications, power consumed by SRAM can dominate the total power of the system-on-chip (SoC) [1]. Scaling down the supply voltage (V<sub>DD</sub>) to subthreshold voltage levels reduces the active power, but reduced on-current and variations in device threshold voltage (V<sub>T</sub>) due to RDF limit V<sub>DD</sub> scaling and circuit functionality [2].

Process technology optimization is one promising path to enabling ULP operation. In [3], the authors demonstrated a 32nm High-K/Metal Gate (HK-MG) technology for low-power applications. The technology provides higher drive current with reduced off-current. However, it limits V<sub>DD</sub> to 1.0V or above. Similarly, a 45nm HK-MG process also targets high-performance applications [4][5]. The authors in [6] addressed the limitation of voltage scaling in bulk-CMOS by using extremely thin SOI (ETSOI) for low-power applications. The ETSOI [6] and Tri-gate FET [7] structures with selectively grown epitaxial channels after STI improve performance but do not address V<sub>T</sub> variation due to RDF [8][9]. None of these technological advancements allow 6T SRAM to operate in the subthreshold region or address subthreshold challenges stated in [1] and [2]. In this paper, we present a 55nm Deeply Depleted Channel (DDC) technology with Ultra-Low-Leakage (ULL) devices that is optimized for ULP subthreshold operation due to higher drive strength, reduced variation, and support for V<sub>DD</sub> scaling for the SRAM and logic.

Kazuyuki Kumeno, Makoto Yasuda, Akihiko Harada, Taiji Ema Mie Fujitsu Semiconductor Ltd., 2000 Mizono Kuwana, Mie, Japan

This paper combines technology and circuit solutions for energy-efficient applications. The proposed 55nm DDC ULL devices reduce  $V_T$  variation through fine-tuned control over the channel length while enabling Reverse Body Biasing (RBB) to minimize leakage, power, and energy for both SRAM and logic. We fabricated a test chip with a 1Kb 6T SRAM and an FIR logic accelerator (Fig. 14) to demonstrate the co-design of the technology with memory and logic circuits.

## II. DDC TECHNOLOGY ADVANTAGES AND LOW-POWER BENEFITS

In the subthreshold region, leakage energy often dominates the active energy. The total leakage current of a device consists of subthreshold, gate, and junction leakage. Raising V<sub>T</sub> by increasing the impurity dose in the channel region can minimize subthreshold current. However, the added impurities worsen RDF and increase junction leakage [9]. A DDC technology for 65nm is presented in [8][9] with an optimal trade-off between V<sub>T</sub> variation and subthreshold leakage. In this paper, we introduce new ULL devices in a 55nm DDC technology targeting total leakage current reduction with RBB. Once subthreshold leakage is reduced sufficiently, gate leakage dominates the total leakage. The gate leakage strongly depends on the thickness of the gate dielectric  $(T_{OX})$. A thicker T<sub>OX</sub> reduces gate leakage but leads to larger V<sub>T</sub> variation from RDF and more V<sub>T</sub> mismatch between devices. With ULL DDC devices, however, the V<sub>T</sub> degradation with a thicker gate dielectric is relaxed by 60% compared with the conventional device at the same gate dielectric thickness, as shown in Fig. 2.

Fig. 1 shows the device cross-section and a TEM of a 55nm ULL device in our DDC technology. The un-doped channel and highly doped screen layer reduce  $V_T$  variation in DDC [9]. The ULL device using DDC further reduces leakage using an optimal selection of channel lengths combined with body biasing. Fig. 3 shows measured I<sub>D</sub> vs V<sub>GS</sub> curves including local and global variations. The reduction in V<sub>T</sub> variation provides ratio-ed circuits such as SRAM with more stability and offers better leakage control for dynamic circuits like DRAM. Higher local and global variation disturbs the circuit functionality in subthreshold due to the exponential dependency of current on V<sub>T</sub>. Fig. 4 shows that the measured 55nm DDC V<sub>T</sub> variability is much less than for non-DDC technology. Fig. 5 shows V<sub>T</sub> roll-off for a ULL device in the DDC technology compared to conventional standard (SVT) and low (LVT)  $V_T$  devices in a non-DDC technology. ULL DDC shows strong control over  $V_T$  across a wide range of channel lengths. Fig. 6 shows the measured  $V_T$  variation across a wide range of temperatures. The temperature dependency of device parameters is within 0.053 $\sigma$.
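The exponential dependence of subthreshold current on V<sub>T</sub> can be made concrete with a short Python sketch. It uses the textbook subthreshold model, in which current scales as exp((V<sub>GS</sub> − V<sub>T</sub>)/(n·kT/q)). The slope factor n = 1.5, the 26mV thermal voltage, and the 90mV example mismatch are illustrative assumptions, not measured values from this work; only the 67% reduction in V<sub>T</sub> variation is taken from the abstract.

```python
import math

THERMAL_VOLTAGE_V = 0.026  # kT/q near room temperature (assumed)
SLOPE_FACTOR = 1.5         # subthreshold slope factor n (assumed, illustrative)


def current_ratio(delta_vt_v, n=SLOPE_FACTOR, v_thermal=THERMAL_VOLTAGE_V):
    """Current ratio between two otherwise identical subthreshold devices
    whose threshold voltages differ by delta_vt_v volts."""
    return math.exp(delta_vt_v / (n * v_thermal))


def reduced_mismatch(delta_vt_v, reduction):
    """V_T mismatch remaining after a fractional reduction (0.67 means 67%)."""
    return delta_vt_v * (1.0 - reduction)


if __name__ == "__main__":
    baseline = 0.090                             # hypothetical 90 mV mismatch
    improved = reduced_mismatch(baseline, 0.67)  # 67% reduction from the abstract
    print(f"baseline current ratio: {current_ratio(baseline):.1f}x")
    print(f"improved current ratio: {current_ratio(improved):.1f}x")
```

Under these assumed parameters, a 90mV mismatch corresponds to roughly a 10X current difference between two devices, while the reduced mismatch brings it down to roughly 2X, which is why tighter V<sub>T</sub> control matters so much for ratio-ed circuits in subthreshold.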

## III. SRAM AND LOGIC OPTIMIZATION WITH ULL DDC

We fabricated a 1Kb SRAM using a compact  $(0.865 \times 0.492 \mu m^2)$  6T bitcell built from ULL devices. The triple-well structure in DDC allows RBB to accentuate the inherent benefits of ULL devices for extra power savings at low  $V_{DD}$, and device sizes are selected to improve margins at low  $V_{DD}$. Fig. 7 shows the 6T bitcell leakage with applied RBB for the LVT and ULL devices. The ULL cells tolerate a higher degree of RBB, which results in a 75X leakage reduction over LVT. The LVT devices limit the usable degree of RBB because junction current increases, whereas ULL devices reduce total leakage by controlling both gate leakage and junction leakage. One of the challenges for subthreshold SRAM operation is the read half-select issue that limits V<sub>DD</sub> scaling [13]. Fig. 8 shows the half-select stability (read SNM) of our fabricated 6T SRAM bitcell. The ULL 6T bitcell allows stable read operations at  $V_{DD}=0.2V$, compared to >0.4V for non-DDC devices. Most subthreshold SRAM bitcells use much larger non-6T topologies due to inadequate margins, so this stable 6T cell enables a much more compact solution for low-V<sub>DD</sub> memory.
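For context, the way RBB raises V<sub>T</sub> is commonly approximated by the textbook long-channel body-effect expression below. This is a generic first-order model and not an equation from this work. The symbols $V_{T0}$ (zero-bias threshold voltage), $\gamma$ (body-effect coefficient), $\phi_F$ (Fermi potential), and $V_{SB}$ (source-to-body reverse bias) are introduced here for illustration only, and a DDC device with a screen layer may deviate from this form.

$$V_T = V_{T0} + \gamma\left(\sqrt{2\phi_F + V_{SB}} - \sqrt{2\phi_F}\right)$$

Because subthreshold current falls exponentially as V<sub>T</sub> rises, even a modest reverse bias can cut subthreshold leakage substantially, until junction or gate leakage becomes the limiting component.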

In the subthreshold region, where leakage energy dominates active energy, leakage reduction is critical. Fig. 9 shows a 98% standby leakage reduction for our 1Kb SRAM array with RBB at 0.2V compared to no RBB. This allows a 6T SRAM array to minimize total energy using RBB in the subthreshold region. Fig. 10 shows the measured active energy and performance of the SRAM at different degrees of applied RBB. Since increasing the applied RBB reduces the array leakage current significantly, the array achieves greater energy savings at subthreshold voltages, where leakage dominates, than at nominal V<sub>DD</sub>. The optimized DDC ULL devices allow a higher degree of leakage reduction while maintaining sufficient I<sub>ON</sub> in the subthreshold region, as shown in Fig. 10. Using the leakage-optimized ULL devices together with RBB multiplies the power and energy benefits. Table 1 compares different low-power SRAM bitcells for energy, array V<sub>MIN</sub>, and area trade-offs. The energy number also depends on word size; therefore, the proposed design, implemented with 16-bit words, has higher energy than [15] and [16], where the word size is 32 bits. The proposed 6T ULL bitcell shows the highest energy/bit for a given iso-area.
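The reason leakage reduction pays off more at subthreshold voltages can be illustrated with a first-order energy model: energy per cycle is the sum of a switching term C·V<sub>DD</sub>² and a leakage term I<sub>leak</sub>·V<sub>DD</sub>·T<sub>cycle</sub>. The sketch below is a generic model under assumed, hypothetical parameters (the capacitance, leakage current, and cycle times are not values from this work), and the function names are introduced here for illustration.

```python
def energy_per_cycle(c_eff_f, vdd_v, i_leak_a, t_cycle_s):
    """Return (dynamic, leakage, total) energy per cycle in joules
    for a first-order C*V^2 plus I_leak*V*T model."""
    dynamic = c_eff_f * vdd_v ** 2
    leakage = i_leak_a * vdd_v * t_cycle_s
    return dynamic, leakage, dynamic + leakage


def leakage_share(c_eff_f, vdd_v, i_leak_a, t_cycle_s):
    """Fraction of the total energy per cycle that is due to leakage."""
    _, leakage, total = energy_per_cycle(c_eff_f, vdd_v, i_leak_a, t_cycle_s)
    return leakage / total


if __name__ == "__main__":
    C_EFF = 1e-12   # hypothetical switched capacitance (F)
    I_LEAK = 1e-7   # hypothetical leakage current (A)
    # Hypothetical operating points: fast at nominal V_DD, slow in subthreshold.
    for label, vdd, t_cycle in [("nominal", 0.9, 1e-8), ("subthreshold", 0.3, 1e-5)]:
        share = leakage_share(C_EFF, vdd, I_LEAK, t_cycle)
        print(f"{label}: leakage is {share:.1%} of energy per cycle")
```

With these assumed numbers the leakage term is negligible at the nominal operating point but dominates in subthreshold, because the long cycle time gives leakage more time to integrate. Any technique that lowers I<sub>leak</sub>, such as RBB, therefore has a much larger effect on total energy at low V<sub>DD</sub>.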



Fig. 1. TEM picture of a DDC ULL device



Fig. 2. Impact of increased gate-oxide thickness (T<sub>OX</sub>) on V<sub>T</sub> variation



Fig. 3. I<sub>D</sub> vs V<sub>GS</sub> across multiple samples and across process corners.



Fig. 4.  $V_T$  variation spread comparison of DDC and conventional (non-DDC) devices (Lg=60nm).



Fig. 5. V<sub>T</sub> roll-off comparison between DDC and non-DDC devices (W=1 $\mu$m, V<sub>DS</sub>=0.9V)



Fig. 6. Reduced temperature sensitivity makes DDC an alternative for low-power IoT devices.



Fig. 7. 75X 6T bitcell leakage minimization using ULL devices that allow a higher degree of RBB.



Fig. 8. Butterfly curves for SRAM 6T bitcell: DDC ULL vs. non-DDC (conventional) bitcell



Fig. 9. Standby leakage reduction with reverse body-biasing.



Fig. 10. SRAM energy and performance optimization using DDC ULL devices and RBB



Fig. 11. I<sub>ON</sub> degradation with increasing degree of RBB



Fig. 12. Effectiveness of RBB on active energy: 16-bit FIR



Fig. 13. Active power reduction of 51 stage RO with applied RBB



Fig. 14. Fabricated chip with 1Kb SRAM and 16-bit FIR block

ULL DDC devices with RBB also optimize digital circuits in subthreshold. To demonstrate the effectiveness of this combination for active power minimization, we consider a 16-bit, 32-tap FIR filter and a 51-stage ring oscillator (RO). Fig. 12 shows that the minimum energy per cycle for the FIR filter (at 0.36V) is ~5X lower than [12], and an RBB of 0.25V gives a further 39.4% reduction due to lower leakage energy. Fig. 13 shows the measured active power of the 51-stage RO across  $V_{DD}$. The effectiveness of RBB is shown as a percentage of active power reduction at different degrees of body biasing, and RBB provides the maximum active energy reduction at low  $V_{DD}$. The result shows an interesting benefit of the subthreshold-optimized process: power minimization using RBB is more advantageous at subthreshold voltages than at nominal voltages.
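As a quick arithmetic check on the FIR numbers, the following Python sketch converts the reported energy per cycle into average power and computes the improvement factor against the 21.3pJ reference value listed in Table 1. The 4.5pJ/cycle, 200 kHz, and 21.3pJ figures come from the paper; the function names are introduced here for illustration.

```python
def average_power_w(energy_per_cycle_j, frequency_hz):
    """Average power in watts for a given energy per cycle and clock frequency."""
    return energy_per_cycle_j * frequency_hz


def improvement_factor(reference, proposed):
    """How many times smaller the proposed value is than the reference."""
    return reference / proposed


if __name__ == "__main__":
    FIR_ENERGY_J = 4.5e-12   # 4.5 pJ/cycle at 0.36 V (from the abstract)
    FIR_FREQ_HZ = 200e3      # 200 kHz (from the abstract)
    REF_ENERGY_J = 21.3e-12  # reference FIR energy per cycle (from Table 1)

    power = average_power_w(FIR_ENERGY_J, FIR_FREQ_HZ)
    factor = improvement_factor(REF_ENERGY_J, FIR_ENERGY_J)
    print(f"FIR average power: {power * 1e6:.2f} uW")
    print(f"energy/cycle improvement: {factor:.1f}x")
```

The product works out to about 0.9 µW, and the ratio of about 4.7 is consistent with the "~5X" figure quoted in the text.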

Table 1. Benchmark table with comparison of different technologies. <sup>+</sup>variation numbers are calculated from the figure provided in the reference.

|                                               | This work   | [4]             | [5]             | [6]             | [7]          |
|-----------------------------------------------|-------------|-----------------|-----------------|-----------------|--------------|
| Lg (nm)                                       | 55          | 65              | 55              | 65              | 130          |
| V<sub>DD</sub> (V)                            | 0.5 - 0.9   | 1.0             | 1.2             | 1               | 0.3 - 0.36   |
| I<sub>ON</sub> (µA/µm) @ V<sub>DD</sub>       | 107/0.6     | 200/1           | 20/20           | NA              | NA           |
| SRAM V<sub>MIN</sub>                          | 0.2         | 0.4             | NA              | 1               | NA           |
| $\Delta V_T$ (mV)                             | 70          | 100<sup>+</sup> | 200<sup>+</sup> | 220<sup>+</sup> | NA           |
| FIR min. energy/cycle (pJ)                    | 4.5 @ 0.36V | NA              | NA              | NA              | 21.3 @ 0.31V |

|                      | This work | [14]  | [15]                | [16]               | [17]      |
|----------------------|-----------|-------|---------------------|--------------------|-----------|
| Tech. (nm)           | 55        | 65    | 65                  | 65                 | 65        |
| Cell Type            | 6T        | 8T    | 9T                  | 14T                | 8T        |
| Transistor Type      | ULL       | NA    | Mixed V<sub>T</sub> | High-V<sub>T</sub> | Low-Power |
| Array V<sub>MIN</sub> | 0.2V      | 0.35V | 0.3V                | 0.5V               | 0.4V      |
| Energy (fJ/bit)      | 31.25     | 870   | 18.2                | 14                 | 78        |

## IV. CONCLUSION

In this paper, we presented a new 55nm ULL DDC technology that reduces total leakage current with less RDF degradation (V<sub>T</sub> variation) across channel length and is optimized for energy in subthreshold operation. Using ULL devices with RBB in subthreshold circuits reduces the leakage of the 6T SRAM (by 98%), the energy/cycle of the SRAM (by 83%), the active power of the RO (by >80%), and the energy/cycle of the FIR (by 5X). Table 1 summarizes the advantages and compares the 55nm DDC technology with other similar technology nodes.

#### ACKNOWLEDGMENT

The authors acknowledge Dilip Vasudevan, Luis Ruiz, and Mie Fujitsu Semiconductor for their support.

#### REFERENCES

- [1] A. Klinefelter *et al.*, "21.3 A 6.45µW self-powered IoT SoC with integrated energy-harvesting power management and ULP asymmetric radios," *ISSCC*, 2015.
- [2] H. N. Patel, F. B. Yahya, and B. H. Calhoun, "Optimizing SRAM Bitcell Reliability and Energy for IoT Applications," – in press, *ISQED*, 2016.
- [3] X. Chen et al., "A cost effective 32nm high-K/ metal gate CMOS technology for low power applications with single-metal/gate-first process," VLSI Technology, 2008 Symposium on, Honolulu, HI, 2008.
- [4] F. Hamzaoglu et al., "A 153Mb-SRAM Design with Dynamic Stability Enhancement and Leakage Reduction in 45nm High-K Metal-Gate CMOS Technology," ISSCC 2008.
- [5] G. Tsutsui et al., "Reduction of Vth variation by work function optimization for 45-nm node SRAM cell," VLSI Technology, 2008.
- [6] K. Cheng *et al.*, "ETSOI CMOS for system-on-chip applications featuring 22nm gate length, sub-100nm gate pitch, and 0.08µm2 SRAM cell," VLSI Circuits Symp., 2011.
- [7] K. J. Kuhn, "CMOS Scaling for 22nm Node and Beyond: Device Physics and Technology," VLSI-TSA, April 2011.
- [8] L. T. Clark et al., "A highly integrated 65-nm SoC process with enhanced power/performance of digital and analog circuits," *IEDM*, 2012.
- [9] K. Fujita *et al.*, "Advanced channel engineering achieving aggressive reduction of VT variation for ultra-low-power applications," *IEDM*, 2011
- [10] N. Kimizuka *et al.*, "Ultra-low standby power (U-LSTP) 65-nm node CMOS technology utilizing HfSiON dielectric and body-biasing scheme," VLSI Technology, 2005.
- [11] M. Yamaoka, N. Maeda, Y. Shimazaki and K. Osada, "65nm Low-Power High-Density SRAM Operable at 1.0V under 3σ Systematic Variation Using Separate Vth Monitoring and Body Bias for NMOS and PMOS," *ISSCC 2008*.
- [12] W. H. Ma, J. C. Kao, V. S. Sathe and M. Papaefthymiou, "A 187MHz subthreshold-supply robust FIR filter with charge-recovery logic," VLSI Circuits, 2009 Symposium on, Kyoto, Japan, 2009.
- [13] H. N. Patel, F. B. Yahya and B. H. Calhoun, "Improving Reliability and Energy Requirements of Memory in Body Sensor Networks," VLSID, Kolkata, 2016
- [14] N. Verma, A. P. Chandrakasan, "A 256 kb 65 nm 8T Subthreshold SRAM Employing Sense-Amplifier Redundancy", *JSSC*, 2008.
- [15] S. Lutkemeier, T. Jungeblut, H.K.O. Berge, S. Aunet, M. Porrmann, U. Ruckert, "A 65nm 32 b Subthreshold Processor with 9T Multi-Vt SRAM and Adaptive Supply Voltage Control", *JSSC*, 2013.
- [16] P. Meinerzhagen, O. Andersson, B. Mohammadi, Y. Sherazi, A. Burg, J. N. Rodrigues, "A 500 fW/bit 14 fJ/bit-access 4kb standard-cell based sub-VT memory in 65nm CMOS", ESSCIRC, 2012.
- [17] M. E. Sinangil, N. Verma, A. P. Chandrakasan, "A reconfigurable 65nm SRAM achieving voltage scalability from 0.25–1.2V and performance scalability from 20kHz–200MHz", ESSCIRC 2008